{ "cells": [ { "cell_type": "markdown", "id": "8ce6a75d", "metadata": {}, "source": [ "# Getting Started with Data Sources\n", "\n", "Welcome to **PyBroker**! The best place to start is to learn about [DataSources](https://www.pybroker.com/en/latest/reference/pybroker.data.html#pybroker.data.DataSource). A ```DataSource``` is a class that can fetch data from external sources, which you can then use to backtest your trading strategies.\n", "\n", "## Yahoo Finance\n", "\n", "One of the built-in ```DataSources``` in **PyBroker** is [Yahoo Finance](https://finance.yahoo.com). To use it, you can import [YFinance](https://www.pybroker.com/en/latest/reference/pybroker.data.html#pybroker.data.YFinance):" ] }, { "cell_type": "code", "execution_count": 1, "id": "f034d992", "metadata": { "ExecuteTime": { "end_time": "2023-08-15T11:35:36.017796Z", "start_time": "2023-08-15T11:35:32.119216400Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading bar data...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[*********************100%%**********************] 2 of 2 completed" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Loaded bar data: 0:00:00 \n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>date</th><th>symbol</th><th>open</th><th>high</th><th>low</th><th>close</th><th>volume</th><th>adj_close</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>2021-03-01</td><td>AAPL</td><td>123.750000</td><td>127.930000</td><td>122.790001</td><td>127.790001</td><td>116307900</td><td>125.599655</td></tr>\n",
 "    <tr><th>1</th><td>2021-03-01</td><td>MSFT</td><td>235.899994</td><td>237.470001</td><td>233.149994</td><td>236.940002</td><td>25324000</td><td>230.847702</td></tr>\n",
 "    <tr><th>2</th><td>2021-03-02</td><td>AAPL</td><td>128.410004</td><td>128.720001</td><td>125.010002</td><td>125.120003</td><td>102260900</td><td>122.975403</td></tr>\n",
 "    <tr><th>3</th><td>2021-03-02</td><td>MSFT</td><td>237.009995</td><td>237.300003</td><td>233.449997</td><td>233.869995</td><td>22812500</td><td>227.856628</td></tr>\n",
 "    <tr><th>4</th><td>2021-03-03</td><td>AAPL</td><td>124.809998</td><td>125.709999</td><td>121.839996</td><td>122.059998</td><td>112966300</td><td>119.967857</td></tr>\n",
 "    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
 "    <tr><th>501</th><td>2022-02-24</td><td>MSFT</td><td>272.510010</td><td>295.160004</td><td>271.519989</td><td>294.589996</td><td>56989700</td><td>289.353271</td></tr>\n",
 "    <tr><th>502</th><td>2022-02-25</td><td>AAPL</td><td>163.839996</td><td>165.119995</td><td>160.869995</td><td>164.850006</td><td>91974200</td><td>162.987427</td></tr>\n",
 "    <tr><th>503</th><td>2022-02-25</td><td>MSFT</td><td>295.140015</td><td>297.630005</td><td>291.649994</td><td>297.309998</td><td>32546700</td><td>292.024872</td></tr>\n",
 "    <tr><th>504</th><td>2022-02-28</td><td>AAPL</td><td>163.059998</td><td>165.419998</td><td>162.429993</td><td>165.119995</td><td>95056600</td><td>163.254364</td></tr>\n",
 "    <tr><th>505</th><td>2022-02-28</td><td>MSFT</td><td>294.309998</td><td>299.140015</td><td>293.000000</td><td>298.790009</td><td>34627500</td><td>293.478607</td></tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "<p>506 rows × 8 columns</p>\n",
 "</div>
" ], "text/plain": [ " date symbol open high low close \\\n", "0 2021-03-01 AAPL 123.750000 127.930000 122.790001 127.790001 \n", "1 2021-03-01 MSFT 235.899994 237.470001 233.149994 236.940002 \n", "2 2021-03-02 AAPL 128.410004 128.720001 125.010002 125.120003 \n", "3 2021-03-02 MSFT 237.009995 237.300003 233.449997 233.869995 \n", "4 2021-03-03 AAPL 124.809998 125.709999 121.839996 122.059998 \n", ".. ... ... ... ... ... ... \n", "501 2022-02-24 MSFT 272.510010 295.160004 271.519989 294.589996 \n", "502 2022-02-25 AAPL 163.839996 165.119995 160.869995 164.850006 \n", "503 2022-02-25 MSFT 295.140015 297.630005 291.649994 297.309998 \n", "504 2022-02-28 AAPL 163.059998 165.419998 162.429993 165.119995 \n", "505 2022-02-28 MSFT 294.309998 299.140015 293.000000 298.790009 \n", "\n", " volume adj_close \n", "0 116307900 125.599655 \n", "1 25324000 230.847702 \n", "2 102260900 122.975403 \n", "3 22812500 227.856628 \n", "4 112966300 119.967857 \n", ".. ... ... \n", "501 56989700 289.353271 \n", "502 91974200 162.987427 \n", "503 32546700 292.024872 \n", "504 95056600 163.254364 \n", "505 34627500 293.478607 \n", "\n", "[506 rows x 8 columns]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from pybroker import YFinance\n", "\n", "yfinance = YFinance()\n", "df = yfinance.query(['AAPL', 'MSFT'], start_date='3/1/2021', end_date='3/1/2022')\n", "df" ] }, { "cell_type": "markdown", "id": "63f2031e", "metadata": {}, "source": [ "The above code queries data for AAPL and MSFT stocks, and returns a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) with the results.\n", "\n", "## Caching Data\n", "\n", "If you want to speed up your data retrieval, you can cache your queries using **PyBroker**'s caching system. 
You can enable caching by calling [pybroker.enable_data_source_cache('name')](https://www.pybroker.com/en/latest/reference/pybroker.cache.html#pybroker.cache.enable_data_source_cache) where ```name``` is the name of the cache you want to use:" ] }, { "cell_type": "code", "execution_count": 2, "id": "6b718f07", "metadata": { "ExecuteTime": { "end_time": "2023-08-15T11:35:36.102496400Z", "start_time": "2023-08-15T11:35:36.014759400Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pybroker\n", "\n", "pybroker.enable_data_source_cache('yfinance')" ] }, { "cell_type": "markdown", "id": "c35fbc8e", "metadata": {}, "source": [ "The next call to [query](https://www.pybroker.com/en/latest/reference/pybroker.data.html#pybroker.data.DataSource.query) will cache the returned data to disk. Each unique combination of ticker symbol and date range will be cached separately:" ] }, { "cell_type": "code", "execution_count": 3, "id": "7e07adbe", "metadata": { "ExecuteTime": { "end_time": "2023-08-15T11:35:37.340726600Z", "start_time": "2023-08-15T11:35:36.042740100Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading bar data...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[*********************100%%**********************] 2 of 2 completed" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Loaded bar data: 0:00:00 \n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>date</th><th>symbol</th><th>open</th><th>high</th><th>low</th><th>close</th><th>volume</th><th>adj_close</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>2021-03-01</td><td>IBM</td><td>115.057358</td><td>116.940727</td><td>114.588913</td><td>115.430206</td><td>5977367</td><td>100.173241</td></tr>\n",
 "    <tr><th>1</th><td>2021-03-01</td><td>TSLA</td><td>230.036667</td><td>239.666672</td><td>228.350006</td><td>239.476669</td><td>81408600</td><td>239.476669</td></tr>\n",
 "    <tr><th>2</th><td>2021-03-02</td><td>IBM</td><td>115.430206</td><td>116.539200</td><td>114.971321</td><td>115.038239</td><td>4732418</td><td>99.833076</td></tr>\n",
 "    <tr><th>3</th><td>2021-03-02</td><td>TSLA</td><td>239.426666</td><td>240.369995</td><td>228.333328</td><td>228.813339</td><td>71196600</td><td>228.813339</td></tr>\n",
 "    <tr><th>4</th><td>2021-03-03</td><td>IBM</td><td>115.200768</td><td>117.237091</td><td>114.703636</td><td>116.978966</td><td>7744898</td><td>101.517288</td></tr>\n",
 "    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
 "    <tr><th>501</th><td>2022-02-24</td><td>TSLA</td><td>233.463333</td><td>267.493347</td><td>233.333328</td><td>266.923340</td><td>135322200</td><td>266.923340</td></tr>\n",
 "    <tr><th>502</th><td>2022-02-25</td><td>IBM</td><td>122.050003</td><td>124.260002</td><td>121.449997</td><td>124.180000</td><td>4460900</td><td>113.041489</td></tr>\n",
 "    <tr><th>503</th><td>2022-02-25</td><td>TSLA</td><td>269.743347</td><td>273.166656</td><td>260.799988</td><td>269.956665</td><td>76067700</td><td>269.956665</td></tr>\n",
 "    <tr><th>504</th><td>2022-02-28</td><td>IBM</td><td>122.209999</td><td>123.389999</td><td>121.040001</td><td>122.510002</td><td>6757300</td><td>111.521271</td></tr>\n",
 "    <tr><th>505</th><td>2022-02-28</td><td>TSLA</td><td>271.670013</td><td>292.286682</td><td>271.570007</td><td>290.143341</td><td>99006900</td><td>290.143341</td></tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "<p>506 rows × 8 columns</p>\n",
 "</div>
" ], "text/plain": [ " date symbol open high low close \\\n", "0 2021-03-01 IBM 115.057358 116.940727 114.588913 115.430206 \n", "1 2021-03-01 TSLA 230.036667 239.666672 228.350006 239.476669 \n", "2 2021-03-02 IBM 115.430206 116.539200 114.971321 115.038239 \n", "3 2021-03-02 TSLA 239.426666 240.369995 228.333328 228.813339 \n", "4 2021-03-03 IBM 115.200768 117.237091 114.703636 116.978966 \n", ".. ... ... ... ... ... ... \n", "501 2022-02-24 TSLA 233.463333 267.493347 233.333328 266.923340 \n", "502 2022-02-25 IBM 122.050003 124.260002 121.449997 124.180000 \n", "503 2022-02-25 TSLA 269.743347 273.166656 260.799988 269.956665 \n", "504 2022-02-28 IBM 122.209999 123.389999 121.040001 122.510002 \n", "505 2022-02-28 TSLA 271.670013 292.286682 271.570007 290.143341 \n", "\n", " volume adj_close \n", "0 5977367 100.173241 \n", "1 81408600 239.476669 \n", "2 4732418 99.833076 \n", "3 71196600 228.813339 \n", "4 7744898 101.517288 \n", ".. ... ... \n", "501 135322200 266.923340 \n", "502 4460900 113.041489 \n", "503 76067700 269.956665 \n", "504 6757300 111.521271 \n", "505 99006900 290.143341 \n", "\n", "[506 rows x 8 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "yfinance.query(['TSLA', 'IBM'], '3/1/2021', '3/1/2022')" ] }, { "cell_type": "markdown", "id": "03b13620", "metadata": {}, "source": [ "Calling ```query``` again with the same ticker symbols and date range returns the cached data:" ] }, { "cell_type": "code", "execution_count": 4, "id": "33569200", "metadata": { "ExecuteTime": { "end_time": "2023-08-15T11:35:37.381132300Z", "start_time": "2023-08-15T11:35:37.338742900Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loaded cached bar data.\n", "\n" ] }, { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>date</th><th>symbol</th><th>open</th><th>high</th><th>low</th><th>close</th><th>volume</th><th>adj_close</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>2021-03-01</td><td>IBM</td><td>115.057358</td><td>116.940727</td><td>114.588913</td><td>115.430206</td><td>5977367</td><td>100.173241</td></tr>\n",
 "    <tr><th>1</th><td>2021-03-02</td><td>IBM</td><td>115.430206</td><td>116.539200</td><td>114.971321</td><td>115.038239</td><td>4732418</td><td>99.833076</td></tr>\n",
 "    <tr><th>2</th><td>2021-03-03</td><td>IBM</td><td>115.200768</td><td>117.237091</td><td>114.703636</td><td>116.978966</td><td>7744898</td><td>101.517288</td></tr>\n",
 "    <tr><th>3</th><td>2021-03-04</td><td>IBM</td><td>116.634796</td><td>117.801147</td><td>113.537285</td><td>114.827919</td><td>8439651</td><td>99.650551</td></tr>\n",
 "    <tr><th>4</th><td>2021-03-05</td><td>IBM</td><td>115.334610</td><td>118.307838</td><td>114.961761</td><td>117.428299</td><td>7268968</td><td>101.907227</td></tr>\n",
 "    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
 "    <tr><th>248</th><td>2022-02-22</td><td>TSLA</td><td>278.043335</td><td>285.576660</td><td>267.033325</td><td>273.843323</td><td>83288100</td><td>273.843323</td></tr>\n",
 "    <tr><th>249</th><td>2022-02-23</td><td>TSLA</td><td>276.809998</td><td>278.433319</td><td>253.520004</td><td>254.679993</td><td>95256900</td><td>254.679993</td></tr>\n",
 "    <tr><th>250</th><td>2022-02-24</td><td>TSLA</td><td>233.463333</td><td>267.493347</td><td>233.333328</td><td>266.923340</td><td>135322200</td><td>266.923340</td></tr>\n",
 "    <tr><th>251</th><td>2022-02-25</td><td>TSLA</td><td>269.743347</td><td>273.166656</td><td>260.799988</td><td>269.956665</td><td>76067700</td><td>269.956665</td></tr>\n",
 "    <tr><th>252</th><td>2022-02-28</td><td>TSLA</td><td>271.670013</td><td>292.286682</td><td>271.570007</td><td>290.143341</td><td>99006900</td><td>290.143341</td></tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "<p>506 rows × 8 columns</p>\n",
 "</div>
" ], "text/plain": [ " date symbol open high low close \\\n", "0 2021-03-01 IBM 115.057358 116.940727 114.588913 115.430206 \n", "1 2021-03-02 IBM 115.430206 116.539200 114.971321 115.038239 \n", "2 2021-03-03 IBM 115.200768 117.237091 114.703636 116.978966 \n", "3 2021-03-04 IBM 116.634796 117.801147 113.537285 114.827919 \n", "4 2021-03-05 IBM 115.334610 118.307838 114.961761 117.428299 \n", ".. ... ... ... ... ... ... \n", "248 2022-02-22 TSLA 278.043335 285.576660 267.033325 273.843323 \n", "249 2022-02-23 TSLA 276.809998 278.433319 253.520004 254.679993 \n", "250 2022-02-24 TSLA 233.463333 267.493347 233.333328 266.923340 \n", "251 2022-02-25 TSLA 269.743347 273.166656 260.799988 269.956665 \n", "252 2022-02-28 TSLA 271.670013 292.286682 271.570007 290.143341 \n", "\n", " volume adj_close \n", "0 5977367 100.173241 \n", "1 4732418 99.833076 \n", "2 7744898 101.517288 \n", "3 8439651 99.650551 \n", "4 7268968 101.907227 \n", ".. ... ... \n", "248 83288100 273.843323 \n", "249 95256900 254.679993 \n", "250 135322200 266.923340 \n", "251 76067700 269.956665 \n", "252 99006900 290.143341 \n", "\n", "[506 rows x 8 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = yfinance.query(['TSLA', 'IBM'], '3/1/2021', '3/1/2022')\n", "df" ] }, { "cell_type": "markdown", "id": "448ea556", "metadata": {}, "source": [ "You can clear your cache using [pybroker.clear_data_source_cache](https://www.pybroker.com/en/latest/reference/pybroker.cache.html#pybroker.cache.clear_data_source_cache):" ] }, { "cell_type": "code", "execution_count": 5, "id": "88641432", "metadata": { "ExecuteTime": { "end_time": "2023-08-15T11:35:37.412747300Z", "start_time": "2023-08-15T11:35:37.352743700Z" } }, "outputs": [], "source": [ "pybroker.clear_data_source_cache()" ] }, { "cell_type": "markdown", "id": "16a02ffe", "metadata": {}, "source": [ "Or disable caching altogether using 
[pybroker.disable_data_source_cache](https://www.pybroker.com/en/latest/reference/pybroker.cache.html#pybroker.cache.disable_data_source_cache):" ] }, { "cell_type": "code", "execution_count": 6, "id": "069afbdd", "metadata": { "ExecuteTime": { "end_time": "2023-08-15T11:35:37.412747300Z", "start_time": "2023-08-15T11:35:37.357876Z" } }, "outputs": [], "source": [ "pybroker.disable_data_source_cache()" ] }, { "cell_type": "markdown", "id": "3ddd46f1", "metadata": {}, "source": [ "Note that these calls should be made after first calling [pybroker.enable_data_source_cache](https://www.pybroker.com/en/latest/reference/pybroker.cache.html#pybroker.cache.enable_data_source_cache)." ] }, { "cell_type": "markdown", "id": "223c7148", "metadata": {}, "source": [ "## Alpaca\n", "\n", "**PyBroker** also includes an [Alpaca](https://alpaca.markets/) ```DataSource``` for fetching stock data. To use it, you can import [Alpaca](https://www.pybroker.com/en/latest/reference/pybroker.data.html#pybroker.data.Alpaca) and provide your API key and secret: " ] }, { "cell_type": "code", "execution_count": 7, "id": "3059a94d", "metadata": { "ExecuteTime": { "end_time": "2023-08-15T11:35:37.475437700Z", "start_time": "2023-08-15T11:35:37.361448900Z" } }, "outputs": [], "source": [ "from pybroker import Alpaca\n", "import os\n", "\n", "alpaca = Alpaca(os.environ['ALPACA_API_KEY'], os.environ['ALPACA_API_SECRET'])" ] }, { "cell_type": "markdown", "id": "e8eb5d95", "metadata": {}, "source": [ "You can query ```Alpaca``` for stock data using the same syntax as with Yahoo Finance, but Alpaca also supports querying data by different timeframes. 
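
A timeframe is written as a short string that pairs a number with a unit, such as `'1m'` for one minute or `'1h'` for one hour. The sketch below is only an illustration of that convention: the helper `timeframe_to_timedelta` and its unit table are assumptions made for this example, they are not part of PyBroker's API, and PyBroker's own parsing may accept other units or behave differently.

```python
import re
from datetime import timedelta

# Hypothetical unit table for this sketch. It covers only the suffixes
# that appear in this notebook: minutes, hours, days and weeks.
_UNITS = {'m': 'minutes', 'h': 'hours', 'd': 'days', 'w': 'weeks'}


def timeframe_to_timedelta(timeframe: str) -> timedelta:
    # Split the string into a positive integer amount and a unit suffix.
    match = re.fullmatch('([0-9]+)([mhdw])', timeframe.strip())
    if match is None:
        raise ValueError(f'Unsupported timeframe: {timeframe!r}')
    amount, unit = match.groups()
    # Build the timedelta from the matching keyword, e.g. minutes=1.
    return timedelta(**{_UNITS[unit]: int(amount)})


for tf in ('1m', '1h', '1d', '1w'):
    print(tf, '->', timeframe_to_timedelta(tf))
```

Each bar in the returned DataFrame then covers one such interval, so a shorter timeframe yields many more rows for the same date range.
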
For example, to query 1-minute data:" ] }, { "cell_type": "code", "execution_count": 8, "id": "0f74ea85", "metadata": { "ExecuteTime": { "end_time": "2023-08-15T11:36:23.802697500Z", "start_time": "2023-08-15T11:35:37.367308500Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading bar data...\n", "Loaded bar data: 0:00:05 \n", "\n" ] }, { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>date</th><th>symbol</th><th>open</th><th>high</th><th>low</th><th>close</th><th>volume</th><th>vwap</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>2021-03-01 04:00:00-05:00</td><td>AAPL</td><td>124.30</td><td>124.56</td><td>124.30</td><td>124.50</td><td>12267.0</td><td>124.433365</td></tr>\n",
 "    <tr><th>1</th><td>2021-03-01 04:00:00-05:00</td><td>MSFT</td><td>235.87</td><td>236.00</td><td>235.87</td><td>236.00</td><td>1429.0</td><td>235.938887</td></tr>\n",
 "    <tr><th>2</th><td>2021-03-01 04:01:00-05:00</td><td>AAPL</td><td>124.56</td><td>124.60</td><td>124.30</td><td>124.30</td><td>9439.0</td><td>124.481323</td></tr>\n",
 "    <tr><th>3</th><td>2021-03-01 04:01:00-05:00</td><td>MSFT</td><td>236.17</td><td>236.17</td><td>236.17</td><td>236.17</td><td>104.0</td><td>236.161538</td></tr>\n",
 "    <tr><th>4</th><td>2021-03-01 04:02:00-05:00</td><td>AAPL</td><td>124.00</td><td>124.05</td><td>123.78</td><td>123.78</td><td>4834.0</td><td>123.935583</td></tr>\n",
 "    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
 "    <tr><th>33340</th><td>2021-03-31 19:57:00-04:00</td><td>MSFT</td><td>237.28</td><td>237.28</td><td>237.28</td><td>237.28</td><td>507.0</td><td>237.367870</td></tr>\n",
 "    <tr><th>33341</th><td>2021-03-31 19:58:00-04:00</td><td>AAPL</td><td>122.36</td><td>122.39</td><td>122.33</td><td>122.39</td><td>3403.0</td><td>122.360544</td></tr>\n",
 "    <tr><th>33342</th><td>2021-03-31 19:58:00-04:00</td><td>MSFT</td><td>237.40</td><td>237.40</td><td>237.35</td><td>237.35</td><td>636.0</td><td>237.378066</td></tr>\n",
 "    <tr><th>33343</th><td>2021-03-31 19:59:00-04:00</td><td>AAPL</td><td>122.39</td><td>122.45</td><td>122.38</td><td>122.45</td><td>5560.0</td><td>122.402606</td></tr>\n",
 "    <tr><th>33344</th><td>2021-03-31 19:59:00-04:00</td><td>MSFT</td><td>237.40</td><td>237.53</td><td>237.40</td><td>237.53</td><td>1163.0</td><td>237.473801</td></tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "<p>33345 rows × 8 columns</p>\n",
 "</div>
" ], "text/plain": [ " date symbol open high low close \\\n", "0 2021-03-01 04:00:00-05:00 AAPL 124.30 124.56 124.30 124.50 \n", "1 2021-03-01 04:00:00-05:00 MSFT 235.87 236.00 235.87 236.00 \n", "2 2021-03-01 04:01:00-05:00 AAPL 124.56 124.60 124.30 124.30 \n", "3 2021-03-01 04:01:00-05:00 MSFT 236.17 236.17 236.17 236.17 \n", "4 2021-03-01 04:02:00-05:00 AAPL 124.00 124.05 123.78 123.78 \n", "... ... ... ... ... ... ... \n", "33340 2021-03-31 19:57:00-04:00 MSFT 237.28 237.28 237.28 237.28 \n", "33341 2021-03-31 19:58:00-04:00 AAPL 122.36 122.39 122.33 122.39 \n", "33342 2021-03-31 19:58:00-04:00 MSFT 237.40 237.40 237.35 237.35 \n", "33343 2021-03-31 19:59:00-04:00 AAPL 122.39 122.45 122.38 122.45 \n", "33344 2021-03-31 19:59:00-04:00 MSFT 237.40 237.53 237.40 237.53 \n", "\n", " volume vwap \n", "0 12267.0 124.433365 \n", "1 1429.0 235.938887 \n", "2 9439.0 124.481323 \n", "3 104.0 236.161538 \n", "4 4834.0 123.935583 \n", "... ... ... \n", "33340 507.0 237.367870 \n", "33341 3403.0 122.360544 \n", "33342 636.0 237.378066 \n", "33343 5560.0 122.402606 \n", "33344 1163.0 237.473801 \n", "\n", "[33345 rows x 8 columns]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = alpaca.query(\n", " ['AAPL', 'MSFT'], \n", " start_date='3/1/2021', \n", " end_date='4/1/2021', \n", " timeframe='1m'\n", ")\n", "df" ] }, { "cell_type": "markdown", "id": "05c2b17d", "metadata": {}, "source": [ "## Alpaca Crypto\n", "\n", "If you are interested in fetching cryptocurrency data, you can use [AlpacaCrypto](https://www.pybroker.com/en/latest/reference/pybroker.data.html#pybroker.data.AlpacaCrypto). 
Here's an example of how to use it:" ] }, { "cell_type": "code", "execution_count": 9, "id": "4f0f8826", "metadata": { "ExecuteTime": { "end_time": "2023-08-15T11:36:25.621317500Z", "start_time": "2023-08-15T11:36:23.800161700Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading bar data...\n", "Loaded bar data: 0:00:06 \n", "\n" ] }, { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>symbol</th><th>date</th><th>open</th><th>high</th><th>low</th><th>close</th><th>volume</th><th>vwap</th><th>trade_count</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>BTC/USD</td><td>2021-01-01 01:00:00-05:00</td><td>29255.71</td><td>29338.25</td><td>29153.55</td><td>29234.15</td><td>42.244289</td><td>29237.240312</td><td>1243.0</td></tr>\n",
 "    <tr><th>1</th><td>BTC/USD</td><td>2021-01-01 02:00:00-05:00</td><td>29235.61</td><td>29236.95</td><td>28905.00</td><td>29162.50</td><td>34.506038</td><td>29078.423185</td><td>1070.0</td></tr>\n",
 "    <tr><th>2</th><td>BTC/USD</td><td>2021-01-01 03:00:00-05:00</td><td>29162.50</td><td>29248.52</td><td>28948.86</td><td>29076.77</td><td>27.596804</td><td>29091.465155</td><td>1110.0</td></tr>\n",
 "    <tr><th>3</th><td>BTC/USD</td><td>2021-01-01 04:00:00-05:00</td><td>29075.31</td><td>29372.32</td><td>29058.05</td><td>29284.92</td><td>20.694200</td><td>29248.730924</td><td>880.0</td></tr>\n",
 "    <tr><th>4</th><td>BTC/USD</td><td>2021-01-01 05:00:00-05:00</td><td>29291.54</td><td>29400.00</td><td>29232.16</td><td>29286.63</td><td>16.617646</td><td>29338.609132</td><td>742.0</td></tr>\n",
 "    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
 "    <tr><th>735</th><td>BTC/USD</td><td>2021-01-31 15:00:00-05:00</td><td>32837.67</td><td>32964.87</td><td>32528.54</td><td>32882.87</td><td>40.631122</td><td>32818.132855</td><td>2197.0</td></tr>\n",
 "    <tr><th>736</th><td>BTC/USD</td><td>2021-01-31 16:00:00-05:00</td><td>32889.01</td><td>32935.98</td><td>32554.59</td><td>32586.68</td><td>26.673190</td><td>32737.975296</td><td>1625.0</td></tr>\n",
 "    <tr><th>737</th><td>BTC/USD</td><td>2021-01-31 17:00:00-05:00</td><td>32599.00</td><td>33126.32</td><td>32599.00</td><td>32998.35</td><td>25.422568</td><td>32923.438893</td><td>1770.0</td></tr>\n",
 "    <tr><th>738</th><td>BTC/USD</td><td>2021-01-31 18:00:00-05:00</td><td>33000.00</td><td>33263.94</td><td>32957.10</td><td>33134.86</td><td>31.072017</td><td>33147.086803</td><td>2203.0</td></tr>\n",
 "    <tr><th>739</th><td>BTC/USD</td><td>2021-01-31 19:00:00-05:00</td><td>33134.03</td><td>33134.03</td><td>32303.44</td><td>32572.03</td><td>60.460424</td><td>32552.937863</td><td>2665.0</td></tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "<p>740 rows × 9 columns</p>\n",
 "</div>
" ], "text/plain": [ " symbol date open high low \\\n", "0 BTC/USD 2021-01-01 01:00:00-05:00 29255.71 29338.25 29153.55 \n", "1 BTC/USD 2021-01-01 02:00:00-05:00 29235.61 29236.95 28905.00 \n", "2 BTC/USD 2021-01-01 03:00:00-05:00 29162.50 29248.52 28948.86 \n", "3 BTC/USD 2021-01-01 04:00:00-05:00 29075.31 29372.32 29058.05 \n", "4 BTC/USD 2021-01-01 05:00:00-05:00 29291.54 29400.00 29232.16 \n", ".. ... ... ... ... ... \n", "735 BTC/USD 2021-01-31 15:00:00-05:00 32837.67 32964.87 32528.54 \n", "736 BTC/USD 2021-01-31 16:00:00-05:00 32889.01 32935.98 32554.59 \n", "737 BTC/USD 2021-01-31 17:00:00-05:00 32599.00 33126.32 32599.00 \n", "738 BTC/USD 2021-01-31 18:00:00-05:00 33000.00 33263.94 32957.10 \n", "739 BTC/USD 2021-01-31 19:00:00-05:00 33134.03 33134.03 32303.44 \n", "\n", " close volume vwap trade_count \n", "0 29234.15 42.244289 29237.240312 1243.0 \n", "1 29162.50 34.506038 29078.423185 1070.0 \n", "2 29076.77 27.596804 29091.465155 1110.0 \n", "3 29284.92 20.694200 29248.730924 880.0 \n", "4 29286.63 16.617646 29338.609132 742.0 \n", ".. ... ... ... ... \n", "735 32882.87 40.631122 32818.132855 2197.0 \n", "736 32586.68 26.673190 32737.975296 1625.0 \n", "737 32998.35 25.422568 32923.438893 1770.0 \n", "738 33134.86 31.072017 33147.086803 2203.0 \n", "739 32572.03 60.460424 32552.937863 2665.0 \n", "\n", "[740 rows x 9 columns]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from pybroker import AlpacaCrypto\n", "\n", "crypto = AlpacaCrypto(\n", " os.environ['ALPACA_API_KEY'], \n", " os.environ['ALPACA_API_SECRET']\n", ")\n", "df = crypto.query('BTC/USD', start_date='1/1/2021', end_date='2/1/2021', timeframe='1h')\n", "df" ] }, { "cell_type": "markdown", "id": "6a308188", "metadata": {}, "source": [ "In the above example, we're querying for hourly data for the BTC/USD currency pair." 
 ] }, { "cell_type": "markdown", "id": "8d7dfaa6", "metadata": {}, "source": [ "## AKShare\n", "\n", "**PyBroker** also includes an [AKShare](https://github.com/akfamily/akshare) ```DataSource``` for fetching **Chinese** stock data. AKShare is a widely used open-source package for obtaining financial data, with a focus on the Chinese market. It is free and provides higher-quality data for the Chinese market than yfinance. To use it, you can import [AKShare](https://www.pybroker.com/en/latest/reference/pybroker.ext.data.html#pybroker.ext.data.AKShare):" ] }, { "cell_type": "code", "execution_count": 10, "id": "d66daaee", "metadata": { "ExecuteTime": { "end_time": "2023-08-15T11:36:26.569270700Z", "start_time": "2023-08-15T11:36:25.623249Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading bar data...\n", "Loaded bar data: 0:00:10 \n", "\n" ] }, { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>date</th><th>symbol</th><th>open</th><th>high</th><th>low</th><th>close</th><th>volume</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>2021-03-01</td><td>000001.SZ</td><td>21.54</td><td>21.68</td><td>21.18</td><td>21.45</td><td>1125387</td></tr>\n",
 "    <tr><th>1</th><td>2021-03-01</td><td>600000.SH</td><td>10.59</td><td>10.64</td><td>10.50</td><td>10.58</td><td>547461</td></tr>\n",
 "    <tr><th>2</th><td>2021-03-02</td><td>000001.SZ</td><td>21.62</td><td>22.15</td><td>21.26</td><td>21.65</td><td>1473425</td></tr>\n",
 "    <tr><th>3</th><td>2021-03-02</td><td>600000.SH</td><td>10.61</td><td>10.70</td><td>10.36</td><td>10.47</td><td>747631</td></tr>\n",
 "    <tr><th>4</th><td>2021-03-03</td><td>000001.SZ</td><td>21.58</td><td>23.08</td><td>21.46</td><td>23.01</td><td>1919635</td></tr>\n",
 "    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
 "    <tr><th>969</th><td>2023-02-27</td><td>600000.SH</td><td>7.16</td><td>7.20</td><td>7.16</td><td>7.16</td><td>158006</td></tr>\n",
 "    <tr><th>970</th><td>2023-02-28</td><td>000001.SZ</td><td>13.75</td><td>13.85</td><td>13.61</td><td>13.78</td><td>607936</td></tr>\n",
 "    <tr><th>971</th><td>2023-02-28</td><td>600000.SH</td><td>7.18</td><td>7.20</td><td>7.14</td><td>7.18</td><td>174481</td></tr>\n",
 "    <tr><th>972</th><td>2023-03-01</td><td>000001.SZ</td><td>13.80</td><td>14.19</td><td>13.74</td><td>14.17</td><td>1223452</td></tr>\n",
 "    <tr><th>973</th><td>2023-03-01</td><td>600000.SH</td><td>7.17</td><td>7.27</td><td>7.17</td><td>7.26</td><td>256613</td></tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "<p>974 rows × 7 columns</p>\n",
 "</div>
" ], "text/plain": [ " date symbol open high low close volume\n", "0 2021-03-01 000001.SZ 21.54 21.68 21.18 21.45 1125387\n", "1 2021-03-01 600000.SH 10.59 10.64 10.50 10.58 547461\n", "2 2021-03-02 000001.SZ 21.62 22.15 21.26 21.65 1473425\n", "3 2021-03-02 600000.SH 10.61 10.70 10.36 10.47 747631\n", "4 2021-03-03 000001.SZ 21.58 23.08 21.46 23.01 1919635\n", ".. ... ... ... ... ... ... ...\n", "969 2023-02-27 600000.SH 7.16 7.20 7.16 7.16 158006\n", "970 2023-02-28 000001.SZ 13.75 13.85 13.61 13.78 607936\n", "971 2023-02-28 600000.SH 7.18 7.20 7.14 7.18 174481\n", "972 2023-03-01 000001.SZ 13.80 14.19 13.74 14.17 1223452\n", "973 2023-03-01 600000.SH 7.17 7.27 7.17 7.26 256613\n", "\n", "[974 rows x 7 columns]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from pybroker.ext.data import AKShare\n", "\n", "akshare = AKShare()\n", "# You can substitute 000001.SZ with 000001, and it will still work!\n", "# and you can set start_date as \"20210301\" format\n", "# You can also set adjust to 'qfq' or 'hfq' to adjust the data, \n", "# and set timeframe to '1d', '1w' to get daily, weekly data\n", "df = akshare.query(\n", " symbols=['000001.SZ', '600000.SH'], \n", " start_date='3/1/2021', \n", " end_date='3/1/2023',\n", " adjust=\"\", \n", " timeframe=\"1d\",\n", ")\n", "df" ] }, { "cell_type": "markdown", "id": "fa593912", "metadata": {}, "source": [ "**NOTE**: If the above causes a ``Native library not available`` error and you still want to use AKShare, then [see this issue for details on how to resolve it](https://github.com/edtechre/pybroker/issues/36#issuecomment-1605910339)." ] }, { "cell_type": "markdown", "id": "893620ec", "metadata": {}, "source": [ "[In the next notebook, we'll take a look at how to use DataSources to backtest a simple trading strategy](https://www.pybroker.com/en/latest/notebooks/2.%20Backtesting%20a%20Strategy.html)." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 5 }